A Time-Sensitive Model for Microblog Retrieval
Abstract
Microblogs, as a form of online communication, generate large amounts of information in a very short time, so retrieving the latest relevant information has become an active research area. Unlike traditional information retrieval (IR), microblog retrieval emphasizes the freshness of content. To address this, we extend traditional IR methods by taking posting time into account. We propose a time-sensitive retrieval model that treats the time factor as a prior probability. Within the retrieval model, we use pseudo-relevance feedback as a query expansion technique to improve retrieval performance. Furthermore, we introduce a strategy for filtering the initial retrieval results that takes post-quality factors into account, including entropy and link features. Experiments on a Twitter corpus show that our algorithm improves retrieval performance and that the retrieval results meet real-time retrieval needs well.
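The idea of treating posting time as a prior probability can be illustrated with a minimal sketch. The abstract does not give the form of the prior or of the relevance model, so everything specific here is an assumption for illustration: an exponential decay prior over post age, a Dirichlet-smoothed query likelihood, and the hypothetical names `log_time_prior`, `query_log_likelihood`, `rank`, `rate`, and `mu`. Pseudo-relevance feedback and the entropy and link filters are not shown.

```python
import math
from collections import Counter


def log_time_prior(query_time, post_time, rate=0.01):
    """Log of an ASSUMED exponential prior P(d) = rate * exp(-rate * age).

    The abstract only says time is used as a prior; the exponential
    form and the `rate` parameter are illustrative choices.
    """
    age = max(query_time - post_time, 0.0)  # posts "from the future" get age 0
    return math.log(rate) - rate * age


def query_log_likelihood(query_terms, doc_terms, coll_tf, coll_len, mu=1.0):
    """Dirichlet-smoothed log P(q | d); an assumed stand-in relevance model."""
    tf = Counter(doc_terms)
    vocab_size = len(coll_tf)
    score = 0.0
    for term in query_terms:
        # Add-one smoothing of the collection model avoids log(0)
        # for query terms that never occur in the collection.
        p_coll = (coll_tf.get(term, 0) + 1.0) / (coll_len + vocab_size)
        score += math.log((tf[term] + mu * p_coll) / (len(doc_terms) + mu))
    return score


def rank(query_terms, posts, query_time, rate=0.01, mu=1.0):
    """Rank posts by log P(q | d) + log P(d), most probable first."""
    coll_tf = Counter(t for p in posts for t in p["terms"])
    coll_len = sum(coll_tf.values())
    scored = []
    for p in posts:
        s = query_log_likelihood(query_terms, p["terms"], coll_tf, coll_len, mu)
        s += log_time_prior(query_time, p["time"], rate)
        scored.append((s, p["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post_id for _, post_id in scored]


# Toy data: two equally relevant posts of different age, one off-topic post.
posts = [
    {"id": "old", "terms": ["earthquake", "news"], "time": 0.0},
    {"id": "new", "terms": ["earthquake", "news"], "time": 90.0},
    {"id": "off", "terms": ["cat", "photo"], "time": 100.0},
]
ranking = rank(["earthquake"], posts, query_time=100.0, rate=0.01, mu=1.0)
```

With these toy values the two on-topic posts score the same on relevance, so the prior alone decides their order and the newer post is ranked first. The balance between freshness and relevance depends on `rate` and `mu`, which would need tuning on real data: with heavy smoothing on very short posts, a recent off-topic post can outrank an older relevant one.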
Similar resources
Learning to Rank Microblog Posts for Real-Time Ad-Hoc Search
Microblogging websites have moved to the center of information production and diffusion, and people can get useful information from other users’ microblog posts. In the era of Big Data, we are overwhelmed by the large volume of microblog posts. To make good use of these informative data, an effective search tool specialized for microblog posts is required. However, it is not trivial to d...
Time-Sensitive Weighting for Microblog Retrieval
We report our system and experiments for the real-time ad hoc task in the 2011 Microblog track. Our goal is to develop an effective technique for retrieving relevant tweets that have been posted recently. In particular, we propose a time-sensitive term weighting strategy that favors tweets posted during heavily discussed periods and a document-length-related weighting method that favors long tweets which are mor...
Burst-aware data fusion for microblog search
We consider the problem of searching posts in microblog environments. We frame this microblog post search problem as a late data fusion problem. Previous work on data fusion has mainly focused on aggregating document lists based on retrieval status values or ranks of documents without fully utilizing temporal features of the set of documents being fused. Additionally, previous work on data fusi...
Improving Microblog Retrieval from Exterior Corpus by Automatically Constructing Microblogging Corpus
A large-scale training corpus consisting of microblogs belonging to a desired category is important for high-accuracy microblog retrieval. Obtaining such a large-scale microblogging corpus manually is very time- and labor-consuming. Therefore, some models for the automatic retrieval of microblogs from an exterior corpus have been proposed. However, these approaches may fail in considering microblog...
Query Expansion Based on a Feedback Concept Model for Microblog Retrieval
We tackle the problem of improving microblog retrieval algorithms by proposing a Feedback Concept Model for query expansion. In particular, we expand the query using knowledge derived from Probase so that the expanded query better reflects users’ search intent, which allows for microblog retrieval at the concept level rather than the term level. In the proposed feedback concept model: ...